2A: Vector
Readings
From R Coding Basics: An Introduction to the Basics of Coding in R by Dr. Gaston Sanchez:
Topics
- Vectors
- Atomic types
- Special values
- Creating vectors with c(), :, seq(), and rep()
- Useful functions for numeric vectors
- Built-in vectors
Basic data structures
- Basic data structures in R include vector, factor, matrix, array, data frame, and list.
- These structures are characterized by their dimension and whether they require all elements to be of the same atomic type.
| Structure | Dimension | Same Atomic Type |
|---|---|---|
| Vector | 1 | Yes |
| Factor | 1 | Yes |
| Matrix | 2 | Yes |
| Data Frame | 2 | No |
| Array | \(\ge\) 2 | Yes |
| List | 1 | No |
Vector
A vector is a sequence of elements that are of the same atomic type.
In R, the index of the first element is always 1.
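To make the same-type rule concrete, here is a small sketch of what c() does when it is given mixed inputs: it converts everything to one common atomic type. The variable names are made up for illustration. The last line also shows that the first element sits at index 1.

```r
# c() coerces mixed inputs to a single atomic type
mixed_num <- c(1L, 2.5)        # integer + double
mixed_chr <- c(1, "a", TRUE)   # anything + character
mixed_lgl <- c(TRUE, 2L)       # logical + integer

typeof(mixed_num)   # "double"
typeof(mixed_chr)   # "character"
typeof(mixed_lgl)   # "integer"

# the first element is at index 1
mixed_chr[1]        # "1"
```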
- A single value is treated as a vector of one element
70 # same as c(70)
Atomic types
An atomic type is one of the six fundamental types in R: logical, integer, double, character, raw, and complex.
Note that integer and double are together known as numeric.
# logical vector
c(TRUE, FALSE, FALSE, TRUE, TRUE)
c(T, F, F, T, T)
# integer vector (numeric)
c(1L, 3L, 2L, 4L, 2L)
# double vector (numeric)
c(6.3, 8.2, 3.1, 4.4, 7.6)
# character vector
c("apple", "orange", "apple", "apple", "orange")
c('dog', 'cat', 'dog', 'dog', 'cat')
# raw and complex also exist but are rarely used
The functions typeof() and storage.mode() tell us the atomic type of a vector.
The function mode() works similarly, except that it returns numeric for both integer and double.
typeof(2.3)
[1] "double"
storage.mode(2.3)
[1] "double"
mode(2.3)
[1] "numeric"
- There are also dedicated checking functions for each atomic type.
is.logical()
is.integer() # integer but not double
is.double() # double but not integer
is.numeric() # either integer or double
is.character()
💻 Hands-On
Try the following R code to see what it returns.
enrollment <- c(10, 30, 15, 20)
typeof(enrollment)
storage.mode(enrollment)
mode(enrollment)
is.integer(enrollment)
is.double(enrollment)
is.numeric(enrollment)
Note that enrollment contains only whole numbers, but because they are not written with the L suffix, as in c(10L, 30L, 15L, 20L), R stores them as double.
enrollment <- c(10, 30, 15, 20)
typeof(enrollment)
[1] "double"
storage.mode(enrollment)
[1] "double"
mode(enrollment)
[1] "numeric"
is.integer(enrollment)
[1] FALSE
is.double(enrollment)
[1] TRUE
is.numeric(enrollment)
[1] TRUE
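If integer storage is really wanted, the values can be written with the L suffix or converted afterwards. The sketch below uses as.integer(), a base R function that is not covered above; the name enrollment_int is only for illustration.

```r
enrollment <- c(10, 30, 15, 20)

# convert the double vector to integer storage
enrollment_int <- as.integer(enrollment)

typeof(enrollment_int)      # "integer"
mode(enrollment_int)        # "numeric"
is.numeric(enrollment_int)  # TRUE

# the values themselves are unchanged
enrollment == enrollment_int
```

Both versions count as numeric, so most arithmetic behaves the same either way.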
Special values
- NULL indicates an undefined object
- NA indicates a missing or “not available” value
- NaN indicates a value that is “not a number”
- Inf indicates positive infinity
- -Inf indicates negative infinity
💻 Hands-On
Try the following R code to see what it returns.
sqrt(-7)
log(-5)
0 / 0
100 / 0
-100 / 0
log(0)
sqrt(-7)
[1] NaN
log(-5)
[1] NaN
0 / 0
[1] NaN
100 / 0
[1] Inf
-100 / 0
[1] -Inf
log(0)
[1] -Inf
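Special values are usually detected with dedicated checking functions rather than by comparison. The following sketch uses is.na(), is.nan(), is.finite(), is.infinite(), and is.null() from base R, none of which are introduced above; the vector x is a made-up example.

```r
# a made-up vector holding several special values
x <- c(1, NA, NaN, Inf, -Inf)

is.na(x)        # TRUE for both NA and NaN
is.nan(x)       # TRUE only for NaN
is.finite(x)    # TRUE only for ordinary numbers
is.infinite(x)  # TRUE for Inf and -Inf

# NULL has length zero and vanishes inside c(); NA keeps its slot
length(NULL)
length(c(1, NULL, 3))
length(c(1, NA, 3))
is.null(NULL)
```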
Creating vectors
- As shown previously, a vector can be created manually using the combine function c().
# logical vector
c(TRUE, FALSE, FALSE, TRUE, TRUE)
c(T, F, F, T, T)
# integer vector (numeric)
c(1L, 3L, 2L, 4L, 2L)
# double vector (numeric)
c(6.3, 8.2, 3.1, 4.4, 7.6)
# character vector
c("apple", "orange", "apple", "apple", "orange")
c('dog', 'cat', 'dog', 'dog', 'cat')
Elements in a vector can have names!
- We can give names directly in c().
c(exam1 = 90, exam2 = 85, final = 92)
exam1 exam2 final
   90    85    92
- We can also create the vector and assign names later.
scores <- c(90, 85, 92)
names(scores) <- c('exam1', 'exam2', 'exam3')
scores
exam1 exam2 exam3
   90    85    92
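Once a vector has names, they can be read back with names() and used to pick out elements. This is a minimal sketch; subsetting by name with [ ] and [[ ]] goes slightly beyond what is covered above, and plain_scores is an illustrative variable name.

```r
scores <- c(exam1 = 90, exam2 = 85, exam3 = 92)

names(scores)                # the names as a character vector
scores["exam2"]              # one element, name kept
scores[["exam2"]]            # one element, name dropped
scores[c("exam3", "exam1")]  # several elements, in the order requested

# names can be removed again with unname()
plain_scores <- unname(scores)
```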
💻 Hands-On
- Use c() to create a short vector for each of these four atomic types: integer, double, logical, and character.
- Assign each vector to a variable with a descriptive name.
- Choose one of your vectors and assign names to its elements.
# integer
experience <- c(1L, 3L, 5L, 2L)
# double
weight <- c(143.5, 150, 127.3, 133.5)
# logical
in_stock <- c(TRUE, FALSE, FALSE, TRUE)
# character
student_levels <- c('Junior', 'Freshman', 'Junior', 'Senior')
Creating numeric vectors
The colon operator
- The colon operator : generates a numeric sequence in steps of one unit.
# start:end (end acts as an upper or lower bound)
-2:5 # start with -2, increase by 1
5:-2 # start with 5, decrease by 1
3.7:9.2 # start with 3.7, increase by 1, stop at 8.7
💻 Hands-On
Use the colon operator : to quickly create the following vectors.
c(3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
c(17, 16, 15, 14, 13, 12, 11, 10, 9)
Since the vectors contain consecutive elements, the colon operator : is useful.
3:12
 [1]  3  4  5  6  7  8  9 10 11 12
17:9
[1] 17 16 15 14 13 12 11 10  9
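Two details of the colon operator are easy to trip over: it binds more tightly than + and -, and it counts downward whenever the start exceeds the end. The sketch below illustrates both; the variable names are made up for the example.

```r
# : returns integers when the start is a whole number, doubles otherwise
typeof(3:12)      # "integer"
typeof(3.7:9.2)   # "double"

# : is evaluated before + and -, so parentheses matter
n <- 5
a <- 1:n - 1      # same as (1:n) - 1, i.e. 0 1 2 3 4
b <- 1:(n - 1)    # 1 2 3 4

# : counts down when start > end
countdown <- 1:0  # 1 0, not an empty vector
```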
The seq() function
- The seq() function generates a numeric sequence with a more general step size.
# step size of 2
seq(from = -2, to = 5, by = 2)
[1] -2  0  2  4
# step size of 0.75
seq(from = -2, to = 5, by = 0.75)
 [1] -2.00 -1.25 -0.50  0.25  1.00  1.75  2.50  3.25  4.00  4.75
# step size is computed automatically from length.out
seq(from = -2, to = 5, length.out = 6)
[1] -2.0 -0.6  0.8  2.2  3.6  5.0
💻 Hands-On
Use the seq() function to create the vector that
- Starts at 5 and ends at -3, increasing or decreasing by 1.
- Starts at -1 and ends at 7, with a step size of 1.5.
- Starts at 0 and ends at 14, containing 5 equally spaced values.
- Starts at -4 and ends at 4, including only every other number.
seq(from = 5, to = -3, by = -1)
[1]  5  4  3  2  1  0 -1 -2 -3
seq(from = -1, to = 7, by = 1.5)
[1] -1.0  0.5  2.0  3.5  5.0  6.5
seq(from = 0, to = 14, length.out = 5)
[1]  0.0  3.5  7.0 10.5 14.0
seq(from = -4, to = 4, by = 2)
[1] -4 -2  0  2  4
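As the second answer shows, seq() stops before the end value when the step does not divide the range evenly. The sketch below repeats that point and adds two relatives of seq(), namely seq_len() and seq_along(); both are base R functions not covered above, and the variable names are illustrative.

```r
# when by does not divide the range evenly, seq() stops before `to`
s1 <- seq(from = 0, to = 10, by = 4)   # 0 4 8

# seq_len(n) counts 1, 2, ..., n and is empty when n is 0
s2 <- seq_len(4)                       # 1 2 3 4
s3 <- seq_len(0)                       # integer(0)

# seq_along(x) gives one index per element of x
fruits <- c("apple", "orange", "kiwi")
s4 <- seq_along(fruits)                # 1 2 3
```

Unlike 1:0, which counts down and returns two elements, seq_len(0) returns an empty vector.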
The rep() function
- The rep() function creates vectors with repeated elements.
# repeat -1 five times
rep(-1, times = 5)
[1] -1 -1 -1 -1 -1
# repeat c(-1, 0, 3) four times
rep(c(-1, 0, 3), times = 4)
 [1] -1  0  3 -1  0  3 -1  0  3 -1  0  3
# repeat -1 two times, 0 three times, 3 four times
rep(c(-1, 0, 3), times = c(2, 3, 4))
[1] -1 -1  0  0  0  3  3  3  3
# repeat -1 five times, 0 five times, 3 five times
rep(c(-1, 0, 3), each = 5)
 [1] -1 -1 -1 -1 -1  0  0  0  0  0  3  3  3  3  3
💻 Hands-On
Use the rep() function to create the vector in which
- Each value in the vector c(1, 3, 6) is repeated exactly 4 times.
- The value 4 is repeated 6 times.
- The vector c(2, -1, 1) is repeated 3 times.
- The values in c(5, 0, -2) are repeated so that 5 appears once, 0 appears three times, and -2 appears four times.
Answer:
rep(c(1, 3, 6), each = 4)
 [1] 1 1 1 1 3 3 3 3 6 6 6 6
rep(4, times = 6)
[1] 4 4 4 4 4 4
rep(c(2, -1, 1), times = 3)
[1]  2 -1  1  2 -1  1  2 -1  1
rep(c(5, 0, -2), times = c(1, 3, 4))
[1]  5  0  0  0 -2 -2 -2 -2
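The each and times arguments can be combined, and rep() also accepts length.out to fix the length of the result. The following sketch shows both; length.out is a real argument of rep() that is not used above, and the variable names are only for illustration.

```r
# each is applied first, then times repeats the whole result
r1 <- rep(c(1, 2), each = 2, times = 3)   # 1 1 2 2 1 1 2 2 1 1 2 2

# length.out recycles the input until the requested length is reached
r2 <- rep(c("a", "b", "c"), length.out = 7)

# rep() works for any atomic type, not just numbers
r3 <- rep(c(TRUE, FALSE), times = c(3, 1))
```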
Summary functions for numeric vectors
Consider a vector
#       1   2   3   4   5   6   7
v <- c(70, 66, 82, 85, 78, 90, 73)
- length() returns its length
- min() returns its minimum value
- max() returns its maximum value
- which.min() returns the index of its minimum value
- which.max() returns the index of its maximum value
- sum() returns the sum of its elements
- prod() returns the product of its elements
length(v)
[1] 7
min(v)
[1] 66
which.min(v)
[1] 2
max(v)
[1] 90
which.max(v)
[1] 6
sum(v)
[1] 544
prod(v)
[1] 1.650193e+13
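When the minimum or maximum occurs more than once, which.min() and which.max() report only the first position. The sketch below, on a made-up vector x, shows one way to find every such position using which() with a comparison, and how the summary functions can be combined.

```r
# a made-up vector with a repeated minimum and a repeated maximum
x <- c(3, 1, 4, 1, 5, 5)

which.min(x)   # 2, the first position of the minimum
which.max(x)   # 5, the first position of the maximum

# every position that attains the minimum or maximum
all_min <- which(x == min(x))   # 2 4
all_max <- which(x == max(x))   # 5 6

# summary functions can be combined, e.g. to compute the average
avg <- sum(x) / length(x)
```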
💻 Hands-On
Consider a vector containing calorie content of light beer brands. Write R code to answer the following questions:
How many light beer brands are included?
What is the lowest calorie content among the light beers?
What is the highest calorie content among the light beers?
At which position does the highest calorie value occur?
What is the total calorie content across all light beer brands?
# Calories per 100ml of light beer
beer_cals <- c(29, 28, 33, 31, 30, 33, 30, 28, 27, 41, 39, 31, 29,
               23, 32, 31, 32, 19, 40, 22, 34, 31, 42, 35, 29, 43)
# Number of light beer brands
length(beer_cals)
[1] 26
# Lowest calorie content
min(beer_cals)
[1] 19
# Highest calorie content
max(beer_cals)
[1] 43
# Position of the highest calorie value
which.max(beer_cals)
[1] 26
# Total calorie content
sum(beer_cals)
[1] 822
💻 Hands-On
Try the following R code to see what it returns.
#       1   2   3   4   5   6   7
v <- c(70, 66, 82, 85, NA, 90, 73)
length(v)
min(v)
which.min(v)
max(v)
which.max(v)
sum(v)
prod(v)
Note that the vector v contains a missing value NA.
#       1   2   3   4   5   6   7
v <- c(70, 66, 82, 85, NA, 90, 73)
Therefore, min(), max(), sum(), and prod() return NA. The functions which.min() and which.max() ignore missing values, so they still return an index, and length() counts the NA as an element.
length(v)
[1] 7
min(v)
[1] NA
which.min(v)
[1] 2
max(v)
[1] NA
which.max(v)
[1] 6
sum(v)
[1] NA
prod(v)
[1] NA
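To summarise only the values that are present, min(), max(), sum(), and prod() accept the argument na.rm = TRUE. The sketch below applies it to the same vector v; the names n_missing, n_observed, and where_missing are illustrative.

```r
v <- c(70, 66, 82, 85, NA, 90, 73)

# drop missing values before summarising
min(v, na.rm = TRUE)    # 66
max(v, na.rm = TRUE)    # 90
sum(v, na.rm = TRUE)    # 466

# length() has no na.rm argument; count with is.na() instead
n_missing     <- sum(is.na(v))     # 1
n_observed    <- sum(!is.na(v))    # 6
where_missing <- which(is.na(v))   # 5
```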
Built-in vectors
- R includes several built-in vectors, such as the letters of the alphabet, month names, US state data, and the constant \(\pi\).
💻 Hands-On
Try the following R code and see what it returns. Feel free to consult the help documentation.
LETTERS
letters
month.abb
month.name
pi
state.abb
state.name
state.area
LETTERS
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
[20] "T" "U" "V" "W" "X" "Y" "Z"
letters
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
[20] "t" "u" "v" "w" "x" "y" "z"
month.abb
 [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
month.name
 [1] "January" "February" "March" "April" "May" "June"
 [7] "July" "August" "September" "October" "November" "December"
pi
[1] 3.141593
state.abb
 [1] "AL" "AK" "AZ" "AR" "CA" "CO" "CT" "DE" "FL" "GA" "HI" "ID" "IL" "IN" "IA"
[16] "KS" "KY" "LA" "ME" "MD" "MA" "MI" "MN" "MS" "MO" "MT" "NE" "NV" "NH" "NJ"
[31] "NM" "NY" "NC" "ND" "OH" "OK" "OR" "PA" "RI" "SC" "SD" "TN" "TX" "UT" "VT"
[46] "VA" "WA" "WV" "WI" "WY"
state.name
 [1] "Alabama" "Alaska" "Arizona" "Arkansas"
[5] "California" "Colorado" "Connecticut" "Delaware"
[9] "Florida" "Georgia" "Hawaii" "Idaho"
[13] "Illinois" "Indiana" "Iowa" "Kansas"
[17] "Kentucky" "Louisiana" "Maine" "Maryland"
[21] "Massachusetts" "Michigan" "Minnesota" "Mississippi"
[25] "Missouri" "Montana" "Nebraska" "Nevada"
[29] "New Hampshire" "New Jersey" "New Mexico" "New York"
[33] "North Carolina" "North Dakota" "Ohio" "Oklahoma"
[37] "Oregon" "Pennsylvania" "Rhode Island" "South Carolina"
[41] "South Dakota" "Tennessee" "Texas" "Utah"
[45] "Vermont" "Virginia" "Washington" "West Virginia"
[49] "Wisconsin" "Wyoming"
state.area
 [1] 51609 589757 113909 53104 158693 104247 5009 2057 58560 58876
[11] 6450 83557 56400 36291 56290 82264 40395 48523 33215 10577
[21] 8257 58216 84068 47716 69686 147138 77227 110540 9304 7836
[31] 121666 49576 52586 70665 41222 69919 96981 45333 1214 31055
[41] 77047 42244 267339 84916 9609 40815 68192 24181 56154 97914
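The built-in vectors can be used like any other vector. As the output above suggests, state.name, state.abb, and state.area list the states in the same order, so a position found in one can be looked up in the others. The sketch below relies on that; the variable names are made up for illustration.

```r
# built-in vectors can be indexed like any other vector
first_three    <- LETTERS[1:3]                     # "A" "B" "C"
quarter_starts <- month.name[seq(1, 12, by = 3)]   # first month of each quarter

# use the position of the largest and smallest area to look up the state
largest  <- state.name[which.max(state.area)]      # "Alaska"
smallest <- state.abb[which.min(state.area)]       # "RI"
```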